Effect of Size and Heterogeneity of Samples on Biomarker Discovery: Synthetic and Real Data Assessment

Authors

  • Barbara Di Camillo
  • Tiziana Sanavia
  • Matteo Martini
  • Giuseppe Jurman
  • Francesco Sambo
  • Annalisa Barla
  • Margherita Squillario
  • Cesare Furlanello
  • Gianna Toffolo
  • Claudio Cobelli
Abstract

MOTIVATION The identification of robust lists of molecular biomarkers related to a disease is a fundamental step for early diagnosis and treatment. However, methodologies for the discovery of biomarkers using microarray data often provide results with limited overlap. These differences are attributable to 1) dataset size (few subjects with respect to the number of features); 2) heterogeneity of the disease; 3) heterogeneity of the experimental protocols and computational pipelines employed in the analysis. In this paper, we focus on the first two issues and assess, on both simulated datasets (generated through an in silico regulation network model) and real clinical datasets, the consistency of the candidate biomarkers provided by a number of different methods.

METHODS We extensively simulated the effect of the heterogeneity characteristic of complex diseases on different sets of microarray data. Heterogeneity was reproduced by simulating both the intrinsic variability of the population and the alteration of regulatory mechanisms. Population variability was simulated by modeling the evolution of a pool of subjects; a subset of them then underwent alterations in regulatory mechanisms so as to mimic the disease state.

RESULTS The simulated data allowed us to outline the advantages and drawbacks of different methods across multiple studies and varying numbers of samples, and to evaluate the precision of feature selection on a benchmark with known biomarkers. Although different methods reached comparable classification accuracy, the use of external cross-validation loops helps in finding features with a higher degree of precision and stability. Application to real data confirmed these results.
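The idea of an external cross-validation loop mentioned in the results can be illustrated with a small sketch. It is not the authors' pipeline: the toy Gaussian data, the Welch-style ranking score, and the names `make_data`, `rank_features`, and `external_cv_selection` are all assumptions made for illustration. The point it shows is that features are ranked on the training part of each fold only, and that the frequency with which a feature is selected across folds gives a simple stability measure that can be compared against known biomarkers.

```python
import random
from statistics import mean, variance

def make_data(n_per_class=20, n_features=50, true_markers=(0, 1, 2), shift=3.0, seed=0):
    """Toy data (an assumption): Gaussian noise, with marker features shifted in class 1."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for j in true_markers:
                    row[j] += shift
            X.append(row)
            y.append(label)
    return X, y

def rank_features(X, y, k):
    """Return the k features with the largest absolute Welch-style t score."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5 or 1.0
        scores.append((abs(mean(a) - mean(b)) / se, j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]

def external_cv_selection(X, y, k=3, n_folds=5, seed=0):
    """Rank features inside each training fold; held-out samples never affect the ranking."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    counts = {}
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        selected = rank_features([X[i] for i in train], [y[i] for i in train], k)
        for j in selected:
            counts[j] = counts.get(j, 0) + 1
    # Fraction of folds in which each feature was selected (a simple stability measure).
    return {j: c / n_folds for j, c in counts.items()}

X, y = make_data()
freq = external_cv_selection(X, y)
stable = sorted(j for j, f in freq.items() if f >= 0.8)
# Precision against the known (simulated) markers 0, 1, 2.
precision = len(set(stable) & {0, 1, 2}) / len(stable) if stable else 0.0
print(stable, precision)
```

In a full analysis the held-out samples of each fold would also be used to estimate classification accuracy; that step is omitted here to keep the sketch short.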

Similar resources

Assessment of "drug-likeness" of a small library of natural products using chemoinformatics

Even though natural products have an excellent record as a source of new drugs, the advent of ultrahigh-throughput screening and large-scale combinatorial synthetic methods has caused a decline in natural products research in the pharmaceutical industry. This is due to the efficiency in generating and screening a high number of synthetic combinatorial compounds; whereas traditional ...

Assessment of household reverse-osmosis systems in heavy metal and solute ion removal in real and synthetic samples

In this study, the efficiency of household reverse-osmosis system (HROS) with and without neutralizer accessory was investigated in both real and synthetic samples. The real samples were collected from rural and urban public drinking-water systems with and without primary refinery treatment. The selected areas were situated in the Kurdistan province, Iran. The HROS model RO100GPD with and witho...

Proteomics Applications in Health: Biomarker and Drug Discovery and Food Industry

Advances in genome sequencing have greatly propelled the understanding of the living world; however, sequencing alone is insufficient for a full description of a biological system. Proteomics has emerged as another large-scale platform for improving the understanding of biology. Proteomic experiments can be used for different aspects of clinical and health sciences such as food technology, biomark...

Journal title:

Volume 7, Issue

Pages -

Publication date: 2012